ABSTRACT

The general two-population discrimination problem is
discussed briefly under various situations. Discrimination
procedures using the linear discriminant function and a
nonparametric procedure due to J. L. Hodges and E. Fix, which
classifies a random variable by assigning it to the
population that has the nearest observation to an observed
value of the random variable, are discussed and compared by
computing the probabilities of misclassification for both
procedures when the two populations are normal with equal
covariance matrices. Probabilities of misclassification are
computed for the nonparametric discriminator and the linear
discriminant function for two small sample sizes for the
case when the two populations being discriminated are
exponential. In this latter case, both discrimination
procedures are shown to give high probabilities of
misclassification for certain values of the parameters of
the distributions being discriminated. Regions are given in
terms of the parameters of the two exponential distributions
where one of the probabilities of error is greater than 0.5.
A more complete investigation for larger sample sizes is
recommended for the linear discriminant function and the
nonparametric procedure discussed in this paper for the case
when the two populations being discriminated are
exponential.
SECTION I

INTRODUCTION

The two-population discrimination problem may be summarized
as follows: given a random variable Z distributed over some
p-dimensional space according to a distribution F, or
according to a distribution G, determine on the basis of an
observation, say z of Z, which of the two distributions Z
has.



When F and G are completely known, the solution to the
problem is implicit in the Neyman-Pearson lemma. (1) The
discrimination depends on the ratio $f(z)/g(z)$, where f and
g are the respective density functions of F and G. The rule
is as follows:

   If $f(z)/g(z) > C$, decide in favor of F.
   If $f(z)/g(z) < C$, decide in favor of G.
   If $f(z)/g(z) = C$, the decision is arbitrary.

C is an appropriate positive constant chosen on the basis of
considerations relating to the importance of the two
possible errors:

   (i)  $P_1$ = P(Z is assigned to G | Z came from F)
   (ii) $P_2$ = P(Z is assigned to F | Z came from G).

The two most widely advocated choices of C are:

   (a) Take C = 1
   (b) Choose C such that $P_1 = P_2$.
This procedure, known as the "likelihood ratio procedure,"
is known to have optimum properties with regard to control
of the probabilities of misclassification.
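As an illustration only, the likelihood ratio rule can be sketched in Python for two densities that are assumed completely known. The names `normal_pdf` and `likelihood_ratio_rule`, and the choice of two unit-variance normal densities with means 0 and 2, are assumptions made for this example and do not come from the paper.

```python
import math


def normal_pdf(x, mean, sd=1.0):
    """Density of a normal distribution (illustrative choice of f and g)."""
    u = (x - mean) / sd
    return math.exp(-0.5 * u * u) / (sd * math.sqrt(2.0 * math.pi))


def likelihood_ratio_rule(z, f, g, c=1.0):
    """Decide between F and G from the ratio f(z)/g(z) and a constant c > 0."""
    ratio = f(z) / g(z)
    if ratio > c:
        return "F"
    if ratio < c:
        return "G"
    return "arbitrary"  # ratio equals c: either decision is allowed


# Hypothetical example densities: F = N(0, 1), G = N(2, 1).
f_density = lambda z: normal_pdf(z, 0.0)
g_density = lambda z: normal_pdf(z, 2.0)

for z in (0.3, 1.7):
    print(z, likelihood_ratio_rule(z, f_density, g_density))
```

Raising C above 1 shifts decisions toward G, which lowers $P_2$ at the expense of $P_1$; choice (b) corresponds to tuning C until the two error probabilities agree.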

When F and G are known except for the values of one or more
parameters, the procedure used is much the same as that just
described. Suppose that F and G are known except for one or
more parameters and that samples are available, say

   $X_1, X_2, X_3, \ldots, X_m$ from F
   $Y_1, Y_2, Y_3, \ldots, Y_n$ from G.

We are then able to estimate the unknown parameters, denoted
collectively by $\theta$. By some estimation procedure, we
can estimate $\theta$ by $\hat{\theta}$ and assume that
$F_{\hat{\theta}}$ and $G_{\hat{\theta}}$ are the correct
distribution functions. The "likelihood ratio procedure" and
the decision rules outlined above can now be applied.

If it is assumed that F and G are p-variate normal
distributions having the same (unknown) covariance matrix
and unknown expectation vectors, the linear discriminant
function is a good example of this procedure. (2) The given
samples are used to estimate the covariance matrices and the
expectation vectors, and the "likelihood ratio procedure" is
used under the assumption that the estimated parameters are
known to be correct. It is known that, under the normal
assumption for F and G and the homoscedastic assumption, the
linear discriminant function is an optimal procedure.
Although this procedure seems reasonable when the parametric
form of the distributions is correct or the assumed form is
correct, there is concern about the validity of this
procedure when the linear discriminant function is used with
data not normal, or if normal, with unequal covariance
matrices. In fact, in the normal situation when the
covariance matrices are not equal, a quadratic function can
be shown to be optimal. There is a need then for a
reasonable discrimination procedure whose validity does not
require the knowledge implied by the normality assumption,
the homoscedastic assumption or any assumption about the
parametric form.

Several classes of nonparametric discrimination procedures
were proposed in (3). These procedures were proven to have
asymptotic optimum properties for large samples.

In (4), some of these nonparametric procedures were
investigated when the samples were small. These procedures
were compared with the linear discriminant function where F
and G were assumed normal with equal covariance matrices,
since under these assumptions the linear discriminant
function is known to be optimal. The comparison was made by
setting the probabilities of misclassification when the
linear discriminant function was used against the
probabilities of misclassification when the nonparametric
procedures were used. A survey of the procedures and results
of (4) is given in Section II of this paper.

In Section III of this paper, an investigation is made of
the performance of one of the nonparametric discriminators
discussed in (4) and of the performance of the linear
discriminant function when F and G are not normal but, in
fact, exponential with parameters $\lambda$ and $\mu$
respectively. The exponential distribution was selected
because of the role it plays in the field of life testing
and other applied problems. It is shown that, for sample
sizes of 1 and 2, both the nonparametric discriminator and
the linear discriminant function give very poor results for
certain values of $\lambda$ and $\mu$.

Detailed conclusions and recommendations made on the basis
of the results attained in Sections II and III are contained
in Section IV of this paper.

Professors R. R. Read and J. R. Borsting, of the U. S. Naval
Postgraduate School, have generously given their time to
provide direction, encouragement and valuable advice to the
author in the writing of this paper.
SECTION II

PERFORMANCE OF THE LINEAR DISCRIMINANT FUNCTION
AND A CLASS OF NONPARAMETRIC DISCRIMINATORS
WHEN THE TWO POPULATIONS BEING DISCRIMINATED
HAVE NORMAL DISTRIBUTIONS WITH
EQUAL COVARIANCE MATRICES

Let $X_1, X_2, \ldots, X_m$ be a sample from a p-variate
distribution F and let $Y_1, Y_2, \ldots, Y_n$ be a sample
from a p-variate distribution G. It is assumed further that
the parametric forms of F and G are unknown. If z is an
observation of a random variable Z known to be distributed
either as F or as G, how is it decided on the basis of z to
which population Z belongs? Define a distance function (in
p-dimensional space) which will permit a ranking of the m+n
observations according to their "nearness" to z. The idea of
the discrimination procedures outlined in (3) is to assign Z
to the population which has the most observations nearest to
z. Specifically, choose an odd integer, k, and assume for
simplicity that m=n; then Z is assigned to the distribution
from which came the majority of the k nearest observations.
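The majority rule just described can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of (3) in full: the function names, the sample points, and the default distance (largest coordinate difference) are assumptions made for the example.

```python
def max_coordinate_distance(x, z):
    """Assumed distance: largest absolute coordinate difference."""
    return max(abs(xi - zi) for xi, zi in zip(x, z))


def knn_classify(z, sample_f, sample_g, k=1, distance=max_coordinate_distance):
    """Assign z to the population supplying most of its k nearest observations."""
    if k % 2 == 0:
        raise ValueError("k must be odd so that the vote cannot tie")
    pooled = [(distance(x, z), "F") for x in sample_f]
    pooled += [(distance(y, z), "G") for y in sample_g]
    pooled.sort(key=lambda pair: pair[0])  # rank the m+n observations by nearness
    votes_f = sum(1 for _, label in pooled[:k] if label == "F")
    return "F" if votes_f > k - votes_f else "G"


# Hypothetical bivariate samples, three observations from each population.
sample_f = [(0.0, 0.0), (0.5, 0.2), (-0.3, 0.1)]
sample_g = [(2.0, 0.0), (2.5, 0.3), (1.8, -0.2)]

print(knn_classify((0.1, 0.0), sample_f, sample_g, k=3))
print(knn_classify((1.9, 0.1), sample_f, sample_g, k=1))
```

With k=1 the function reduces to the "nearest neighbor" rule used for most of the computations in this section. Exact ties in distance are resolved here by list order, which is harmless for continuous distributions where ties have probability zero.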

In (3), it was shown that several classes of these
nonparametric discriminators have asymptotically optimum
performance as $m \to \infty$ and $n \to \infty$ at the same
rate. By optimum performance, it is meant that the
probabilities of misclassification $P_1$ and $P_2$, as
defined in the introduction, tend to the theoretical minimum
values which they could have if F and G were completely
known.

The asymptotic properties and the simplicity of applying the
procedures of this class of nonparametric discriminators
suggest that this type of procedure might be a reasonable
alternative to the commonly applied linear discriminant
function. However, to propose an alternative to the linear
discriminant function solely on the basis of asymptotic
properties and ease of application would not be entirely
reasonable. In particular, the small sample performance of
such nonparametric discriminators needs investigation to
ascertain how much discrimination power is lost when F and G
are known to be normal with equal covariance matrices so
that the linear discriminant function is appropriate. One
way this investigation can be accomplished is by comparing
the probabilities of misclassification when the linear
discriminant function is used with the corresponding
probabilities of misclassification when the nonparametric
discriminators are used. Such an investigation was made in
(4). The remainder of Section II is devoted to summarizing
the procedures and results of (4).

It is first pointed out that the problem can be reduced
considerably by considering linear transformations in the
observation space. It is always possible by such
transformations to insure that F and G will have the
identity covariance matrix. In other words, in the new space
the p transformed measurements are independent in each
population and each measurement has unit variance. It is
also possible by such transformations to put the expectation
vector of the F population at the origin and the expectation
vector of the G population on the positive first axis. This
allows complete specification of the transformed populations
by the two parameters p and $\lambda$, where

   $\lambda$ = E(first coordinate of Y)
             = distance between the means of the
               transformed populations.

In performing such linear transformations, $P_1$ and $P_2$
for the linear discriminant function are unchanged. The
probabilities $P_1$ and $P_2$ for the nonparametric
discriminators are likewise unchanged since such linear
transformations map the totality of distance functions
one-one into the totality in the new space.

It is assumed that the sizes of the samples taken from each
population are equal, m=n. In the main, the distance
function used is

   $\Delta(x, z) = \max_{1 \le i \le p} |x_i - z_i|$.

It should be pointed out that $\Delta$ is just one of a
large class of distance functions, any one of which could be
used. This fact is mentioned since the probabilities $P_1$
and $P_2$ depend very heavily on the distance function
chosen. Most of the computations are made using k=1; that
is, assign Z to the population F or G from which came the
individual of the pooled samples which most closely
resembles Z.

The first case considered is the univariate case, p=1. Using
the rule of the "nearest neighbor," that is, k=1, and the
distance function $\Delta = |x - z|$, which corresponds to
ordinary Euclidean distance in this case, the probabilities
$P_1$ and $P_2$ are computed for various values of n and
$\lambda$.

For p=1, the linear discriminant function is greatly
simplified since no matrix computation enters. The
arithmetic mean $(\bar{X}+\bar{Y})/2$ of the sample means is
computed and Z is assigned to that population whose sample
mean lies on the same side of $(\bar{X}+\bar{Y})/2$ as does
Z itself. The probabilities of misclassification are now
readily computed.

From the symmetry of the problem, $P_1 = P_2$, so it is
sufficient to compute $P_1$; thus, it is assumed that Z is
distributed according to the F distribution. As was pointed
out previously, linear transformations make it possible to
put $E(X)=0$, $E(Y)=\lambda>0$ and
$\sigma_X^2 = \sigma_Y^2 = 1$ with no loss of generality.

An error is committed by the linear discriminant function if
and only if

   (i)  $Z > (\bar{X}+\bar{Y})/2$ and $\bar{Y} > \bar{X}$, or
   (ii) $Z < (\bar{X}+\bar{Y})/2$ and $\bar{Y} < \bar{X}$.

Define $U = \bar{Y} - \bar{X}$ and
$V = \bar{X} + \bar{Y} - 2Z$. It is easily shown that U
and V are independent normal random variables with
$E(U)=\lambda$, $\sigma_U^2 = 2/n$, $E(V)=\lambda$,
$\sigma_V^2 = 4 + 2/n$. In terms of the variables U and V,
an error is committed by the linear discriminant function if
and only if UV<0. Thus it follows for the linear
discriminant function when p=1 that

   $P_1 = P_2 = \Phi\left(\frac{\sqrt{n}\,\lambda}{\sqrt{2}}\right) \Phi\left(-\frac{\sqrt{n}\,\lambda}{\sqrt{4n+2}}\right) + \Phi\left(-\frac{\sqrt{n}\,\lambda}{\sqrt{2}}\right) \Phi\left(\frac{\sqrt{n}\,\lambda}{\sqrt{4n+2}}\right)$,

where

   $\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\,dt$.

Since $\lim_{n \to \infty} P_1 = \Phi(-\lambda/2)$, it is
observed that the maximum probability of misclassification
is .5. The values of $P_1 = P_2$ for various values of n and
$\lambda$ are given in Table 1. Figures 1 and 2 give these
results graphically. All Tables and Figures in Section II
have been reproduced from (4).
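The error probability of the linear discriminant function for p=1 is a product of normal tail areas, so it is easy to evaluate directly. The sketch below uses only the Python standard library; the function names are chosen for this example, and the formula coded is the one implied by the stated means and variances of U and V.

```python
import math


def std_normal_cdf(x):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def ldf_normal_error(n, lam):
    """P(UV < 0) for independent normal U and V with the stated moments."""
    a = math.sqrt(n) * lam / math.sqrt(2.0)            # E(U) / sd(U)
    b = math.sqrt(n) * lam / math.sqrt(4.0 * n + 2.0)  # E(V) / sd(V)
    return (std_normal_cdf(a) * std_normal_cdf(-b)
            + std_normal_cdf(-a) * std_normal_cdf(b))


for n in (1, 2, 5, 10, 100):
    row = [round(ldf_normal_error(n, lam), 4) for lam in (1.0, 2.0, 3.0)]
    print(n, row)
```

For n=1 the values can be compared with the first row of Table 2, since the two procedures coincide in that case. As n grows the output approaches $\Phi(-\lambda/2)$.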



We consider now the nonparametric discriminator using the
"rule of the nearest neighbor," k=1, which consists of
assigning Z to that population from which came the sample
individual nearest to z. Suppose that Z=z. Let $P_1(z)$
denote the conditional probability that the nearest of the
2n sample observations to z is a y, given Z=z. Then,

   $P_1 = E(P_1(Z)) = \int_{-\infty}^{\infty} P_1(z)\,d\Phi(z)$      (1).
TABLE 1

PROBABILITY OF ERROR, LINEAR DISCRIMINANT FUNCTION,
UNIVARIATE NORMAL DISTRIBUTIONS

n = size of sample taken from each population
$\lambda$ = distance between the means of the two populations
Probability of error = P(Z is assigned to G | Z came from F)
                     = P(Z is assigned to F | Z came from G)

FIGURE 1

[Plot: probability of error (vertical axis) against n
(horizontal axis), curves labeled by $\lambda$.]

Probability of error $P_1$ of the linear discriminant
function for two univariate normal distributions with
distance between means = $\lambda$.
n = size of sample from each population.

FIGURE 2

Probability of error of the linear discriminant function for
two univariate normal distributions with distance between
the means = $\lambda$, plotted as a function of $\lambda$.
n = size of sample from each population.
It remains then to calculate $P_1(z)$. Define

   $H_z(\delta) = P(|X - z| < \delta)$,   $\delta > 0$
               $= P(z - \delta < X < z + \delta)$
               $= \Phi(z + \delta) - \Phi(z - \delta)$

and

   $K_z(\delta) = P(|Y - z| < \delta)$
               $= P(z - \lambda - \delta < Y - \lambda < z - \lambda + \delta)$
               $= \Phi(z - \lambda + \delta) - \Phi(z - \lambda - \delta)$.

The event, "the nearest sample value to z is a y," can be
classified into the n exclusive equiprobable events, "the
nearest sample value to z is $y_i$," i = 1, 2, ..., n. Since
the nearest y to z will necessarily be the one at minimum
distance from z, it is necessary to compute the probability
density function for the minimum of
$|Y_1 - z|, \ldots, |Y_n - z|$. Since $|Y_i - z|$,
i = 1, 2, ..., n, are independent identically distributed
random variables, this density function is easily shown to
be

   $n(1 - K_z(\delta))^{n-1}\,dK_z(\delta)$.

$P_1(z)$ is then computed by the following formula:

   $P_1(z) = n \int_0^{\infty} (1 - H_z(\delta))^n (1 - K_z(\delta))^{n-1}\,dK_z(\delta)$      (2).

Formulae (1) and (2) form the basis for all the computations
for the "nearest neighbor rule" no matter what the value of
p, if for p > 1 one replaces $P(|X - z| < \delta)$ by P(the
distance of X from z < $\delta$) and similarly
$P(|Y - z| < \delta)$ by P(the distance of Y from z <
$\delta$). Of course the specific evaluations depend upon
the distance function used.
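Formulae (1) and (2) lend themselves to direct numerical integration. The following sketch does this for the univariate normal case with a composite Simpson rule from the standard library only. The function names, step counts, and truncation limits are assumptions made for the example, not the numerical method used in (4).

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def simpson(func, a, b, steps):
    """Composite Simpson rule on [a, b]; `steps` must be even."""
    h = (b - a) / steps
    total = func(a) + func(b)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * func(a + i * h)
    return total * h / 3.0


def nn_cond_error(z, n, lam, steps=200):
    """Formula (2): probability that the nearest sample value to z is a y."""
    def integrand(d):
        h = Phi(z + d) - Phi(z - d)               # H_z(delta)
        k = Phi(z - lam + d) - Phi(z - lam - d)   # K_z(delta)
        dk = phi(z - lam + d) + phi(z - lam - d)  # derivative of K_z in delta
        return n * (1.0 - h) ** n * (1.0 - k) ** (n - 1) * dk
    upper = abs(z - lam) + 8.0  # beyond this, dK_z is negligible
    return simpson(integrand, 0.0, upper, steps)


def nn_normal_error(n, lam, steps=240):
    """Formula (1): average P_1(z) over Z distributed as N(0, 1)."""
    return simpson(lambda z: nn_cond_error(z, n, lam) * phi(z), -6.0, 6.0, steps)


for n in (1, 2, 3, 4):
    print(n, [round(nn_normal_error(n, lam), 4) for lam in (1.0, 2.0, 3.0)])
```

The printed rows can be set beside Table 2. For n=1 the result should agree with the linear discriminant function, which gives a convenient check on the integration.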
Except for the case p=1, n=1, in which case $P_1$ and $P_2$
are the same for the linear discriminant function and the
nonparametric discriminator, the bulk of the computations
for the nonparametric discriminator were carried out by
straightforward numerical integration. These computations
are given in Table 2. These computations are quite heavy,
especially for the case p=2. Therefore, a search for an
approximation formula for the computation of $P_1(z)$ was
instituted. One approximation formula was found which gave
very good results. A discussion of this approximation
formula is given in (4). $P_1$ as computed using the
approximation formula for $P_1(z)$ is tabled in Table 2-A.
One very interesting result which was obtained using the
approximation formula for $P_1(z)$ was that for large n,

   $P_1 \approx \int_{-\infty}^{\infty} f(z) \left[\frac{g(z)}{f(z) + g(z)}\right] dz = \int_{-\infty}^{\infty} \frac{f(z)\,g(z)}{f(z) + g(z)}\,dz$.

An application of Schwarz's inequality shows the latter
integral to be at most 0.5. It is thus possible to assert
that, whatever be the populations being discriminated, the
"rule of the nearest neighbor" will have in the limit as
$m = n \to \infty$ equal probabilities of error at most 0.5.
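The limiting integral is simple to evaluate numerically for the two unit-variance normal densities of this section. The sketch below is illustrative: the function names, integration range, and step count are assumptions for the example.

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def simpson(func, a, b, steps):
    """Composite Simpson rule on [a, b]; `steps` must be even."""
    h = (b - a) / steps
    total = func(a) + func(b)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * func(a + i * h)
    return total * h / 3.0


def nn_limit_error(lam, steps=2000):
    """Integral of f*g/(f+g) with f = N(0,1) density and g = N(lam,1) density."""
    def integrand(z):
        f, g = phi(z), phi(z - lam)
        total = f + g
        return f * g / total if total > 0.0 else 0.0
    return simpson(integrand, -10.0, 10.0 + lam, steps)


for lam in (0.0, 1.0, 2.0, 3.0):
    print(lam, round(nn_limit_error(lam), 3))
```

At $\lambda = 0$ the two densities coincide and the integral attains the bound 0.5. The values for $\lambda$ = 1, 2, 3 can be compared with the last row of Table 2-A.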

To compare the figures of Tables 1, 2, and 2-A, the values
of $P_1 = P_2$ for paired values of $\lambda$ are plotted
against n in Figure 3. In Figure 4, the same values are
plotted against $\lambda$ for selected values of n.
TABLE 2

PROBABILITY OF ERROR, NONPARAMETRIC DISCRIMINATOR
WITH k=1, UNIVARIATE NORMAL DISTRIBUTION

   n      $\lambda$=1    $\lambda$=2    $\lambda$=3
   1        .4175          .2532          .1235
   2        .4086          .2364          .1084
   3        .4052          .230?          .1036
   4        .4032          .2280          .1014

TABLE 2-A

APPROXIMATE PROBABILITY OF ERROR, NONPARAMETRIC
DISCRIMINATOR WITH k=1, UNIVARIATE NORMAL DISTRIBUTION

   n           $\lambda$=1    $\lambda$=2    $\lambda$=3
   4             .403           .226           .102
   5             .400           .225           .100
   10            .399           .223           .098
   20            .398           .224           .098
   50            .398           .225           .098
   $\infty$      .398           .225           .098

n = size of sample from each population
$\lambda$ = distance between the means of the two populations
Probability of error = P(Z is assigned to G | Z came from F)
                     = P(Z is assigned to F | Z came from G)



FIGURE 3

[Plot: probability of error (vertical axis) against n
(horizontal axis).]

Comparison of the probability of error $P_1$ as a function
of n for the linear discriminant function and the
nonparametric discriminator, distance function $\Delta$,
k=1, for two normal univariate populations with distance
between means = $\lambda$.
n = size of sample from each population

FIGURE 4

Comparison of the probability of error as a function of
$\lambda$, the distance between the means, for the linear
discriminant function and the nonparametric discriminator,
distance function = $\Delta$, k=1, for two normal univariate
populations

n = size of sample from each population
n = 1 is identical for both

indicates the nonparametric procedure 
Not discussed in this paper, but investigated to a very
limited extent in (4), are the following cases:

   (i)   the nonparametric discriminator using $\Delta$ as a
         distance function with k = 3 for the univariate
         and bivariate normal distributions
   (ii)  the nonparametric discriminator using $\Delta$ as a
         distance function with k = 1, n = 1 for p = 2
   (iii) the effect of distance functions other than
         $\Delta$ on the probabilities of misclassification
         for the bivariate normal distribution.

Although the investigation of the above cases was extremely
limited due to the laborious computations, the results that
were obtained indicated that the nonparametric
discrimination procedure gave "reasonable" error
probabilities in both cases (i) and (ii). In the bivariate
normal distribution, different distance functions produced
vastly different error probabilities in some situations.
SECTION III

PERFORMANCE OF THE LINEAR DISCRIMINANT FUNCTION
AND A CLASS OF NONPARAMETRIC DISCRIMINATORS
WHEN F AND G ARE EXPONENTIALLY DISTRIBUTED

In this section, a limited investigation of the linear
discriminant function and the nonparametric discriminator
using $\Delta$ as a distance function and using "the rule of
the nearest neighbor," k=1, is made when F and G are not
normally distributed but, in fact, exponentially distributed
with parameters $\lambda$ and $\mu$ respectively. The
performance of both the linear discriminant function and the
nonparametric discriminator will be investigated again by
computing the probabilities of misclassification. Under the
assumption that F and G are exponentially distributed, it
will be shown that the linear discriminant function and the
nonparametric discriminator using $\Delta$ as a distance
function and "the rule of the nearest neighbor" can give
high probabilities of misclassification.

Throughout the remainder of the section, it will be assumed
that m = n and that F and G are exponentially distributed
with parameters $\lambda$ and $\mu$ respectively. Because of
the heavy computations involved in computing the
probabilities of misclassification,

   (i)  $P_1$ = P(assigning Z to G | Z came from F)
   (ii) $P_2$ = P(assigning Z to F | Z came from G),

the only cases investigated will be for p=1 and n=1, 2.

$P_1$ and $P_2$ will first be computed for the linear
discriminant function. The procedure here is precisely that
which was used in Section II for p = 1. One simply computes
the arithmetic mean $(\bar{X}+\bar{Y})/2$ of the sample
means and assigns Z to that population whose sample mean
lies on the same side of $(\bar{X}+\bar{Y})/2$ as does z
itself. While $P_1 \neq P_2$, it is only necessary to
compute $P_1$ since $P_2$ can readily be computed from $P_1$
by interchanging $\lambda$ and $\mu$.

Proceeding as in Section II, define the new variables
$U = \bar{Y} - \bar{X}$ and $V = \bar{X} + \bar{Y} - 2Z$.
If U and V are to be independent, it is necessary that the
covariance of U and V be zero. Computing the covariance of U
and V we have:

   $\mathrm{Cov}(U, V) = \frac{1}{n}\left(\frac{1}{\mu^2} - \frac{1}{\lambda^2}\right) \neq 0$ except for $\lambda = \mu$.

Since discrimination is not possible for $\lambda = \mu$,
Cov(U, V) will not be zero in the cases of interest and in
general U and V will not be independent. As before, an error
is committed by the linear discriminant function if and only
if

   (i)  $Z > (\bar{X}+\bar{Y})/2$ and $\bar{Y} > \bar{X}$, or
   (ii) $Z < (\bar{X}+\bar{Y})/2$ and $\bar{Y} < \bar{X}$.

In terms of the variables U and V, an error is committed if
and only if UV < 0, and therefore,

   $P_1 = P(UV < 0)$.
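Because $P_1 = P(UV < 0)$ involves only sample means and one further observation, it can be estimated by simulation before any integration is attempted. The sketch below is a Monte Carlo illustration, not the computation used in this paper. It assumes that $\lambda$ and $\mu$ are rate parameters (density $\lambda e^{-\lambda x}$), and the function name, trial count, and seed are choices made for the example.

```python
import random


def ldf_error_mc(n, lam, mu, trials=100_000, seed=0):
    """Estimate P_1 = P(UV < 0) when Z and the X's are Exp(lam), the Y's Exp(mu)."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(trials):
        x_bar = sum(rng.expovariate(lam) for _ in range(n)) / n
        y_bar = sum(rng.expovariate(mu) for _ in range(n)) / n
        z = rng.expovariate(lam)     # Z comes from F
        u = y_bar - x_bar            # U = Ybar - Xbar
        v = x_bar + y_bar - 2.0 * z  # V = Xbar + Ybar - 2Z
        if u * v < 0.0:
            errors += 1              # Z falls on the G side of the midpoint
    return errors / trials


print(ldf_error_mc(1, 2.0, 1.0))
print(ldf_error_mc(2, 3.0, 1.0))
```

With $\lambda = 2\mu$ and n=1 the estimate should land near the Table 3 entry .4000, up to simulation error of a few thousandths. Swapping the two rate arguments gives an estimate of $P_2$.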
Since U and V are not in general independent, the
probability that UV < 0 is not easily computed. It is
necessary to compute the joint density function for U and V
and integrate over the region where UV < 0. The joint
density function of U and V was computed, but because of the
complex nature of this function, it was considered easier to
compute $P_1$ directly. By (i) and (ii) and the definition
of $P_1$ it follows that

   $P_1 = P\left(Z > \frac{\bar{X}+\bar{Y}}{2},\ \bar{Y} > \bar{X}\right) + P\left(Z < \frac{\bar{X}+\bar{Y}}{2},\ \bar{Y} < \bar{X}\right)$.

Let $T = n\bar{Y}$ and $S = n\bar{X}$; thus,

   $f_T$ is the gamma density function with parameters n and $\mu$,
   $f_S$ is the gamma density function with parameters n and $\lambda$.

Since T, S, and Z are independent random variables,

   $P_1 = \int_0^{\infty} \int_s^{\infty} \int_{(s+t)/2n}^{\infty} f_Z(z)\,f_T(t)\,f_S(s)\,dz\,dt\,ds + \int_0^{\infty} \int_0^{s} \int_0^{(s+t)/2n} f_Z(z)\,f_T(t)\,f_S(s)\,dz\,dt\,ds$.

$P_1$ can now be computed by direct numerical integration.
For n=1, $P_1$ as a function of $\lambda$ and $\mu$ is

   $P_1(\lambda, \mu) = \frac{10\lambda^2\mu + 15\lambda\mu^2 + 2\mu^3}{3(\mu + 2\lambda)(\lambda + 2\mu)(\lambda + \mu)}$.

By interchanging $\lambda$ and $\mu$, $P_2$ is

   $P_2(\lambda, \mu) = P_1(\mu, \lambda)$.
Recognizing that the numerator and denominator in the
expressions for $P_1$ and $P_2$ are homogeneous of degree 3
in $\lambda$ and $\mu$, $P_1$ and $P_2$ can be expressed in
terms of a single parameter c by setting $\lambda = c\mu$.
Making this substitution in the expressions for $P_1$ and
$P_2$ we have,

   $P_1(c) = \frac{10c^2 + 15c + 2}{3(c + 1)(c + 2)(2c + 1)}$

   $P_2(c) = P_1\left(\frac{1}{c}\right)$.

For n=1, $P_1$ and $P_2$ for the linear discriminant
function are the same as $P_1$ and $P_2$ for the
nonparametric discriminator using $\Delta$ as a distance
function and "the nearest neighbor rule, k=1."
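The n=1 expressions are rational functions of c, so they can be evaluated exactly. The sketch below takes $P_1(c) = (10c^2 + 15c + 2)/(3(c+1)(c+2)(2c+1))$ and $P_2(c) = P_1(1/c)$, a reading that agrees with the n=1 rows of Table 3. The function names are chosen for this example.

```python
from fractions import Fraction


def p1_n1(c):
    """P_1 for n = 1 as a function of c = lambda / mu (exact arithmetic)."""
    c = Fraction(c)
    return (10 * c**2 + 15 * c + 2) / (3 * (c + 1) * (c + 2) * (2 * c + 1))


def p2_n1(c):
    """P_2 for n = 1, obtained by interchanging lambda and mu."""
    return p1_n1(1 / Fraction(c))


for c in (1, 2, 3, 4, 5, 10):
    print(c, round(float(p1_n1(c)), 4), round(float(p2_n1(c)), 4))
```

Exact fractions make the structure visible: at c=2, for instance, $P_1$ is 2/5 and $P_2$ is 8/15, so one of the two error probabilities already exceeds one half.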

For n=2, the substitution $\lambda = c\mu$ is again
appropriate and $P_1$ and $P_2$ for n=2 are as follows:

   $P_1(c) = \frac{128c^2(2c + 3)}{(c + 4)^2(3c + 2)^3} + \frac{3c + 1}{(c + 1)^3} - \frac{128(4c + 1)}{25(3c + 2)^3}$

   $P_2(c) = P_1\left(\frac{1}{c}\right)$.

Values of $P_1$ and $P_2$ for the linear discriminant
function for n=1 and 2 are tabled for various values of c in
Table 3.
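The n=2 expression for the linear discriminant function can be checked numerically in the same way. The sketch below codes the three-term form $128c^2(2c+3)/((c+4)^2(3c+2)^3) + (3c+1)/(c+1)^3 - 128(4c+1)/(25(3c+2)^3)$; the exponents in this form are a reconstruction that follows from integrating the gamma densities for n=2, and the function names are chosen for the example.

```python
def ldf_exp_p1_n2(c):
    """P_1 of the linear discriminant function for n = 2, with c = lambda / mu."""
    first = 128.0 * c**2 * (2.0 * c + 3.0) / ((c + 4.0) ** 2 * (3.0 * c + 2.0) ** 3)
    second = (3.0 * c + 1.0) / (c + 1.0) ** 3   # P(Ybar < Xbar)
    third = 128.0 * (4.0 * c + 1.0) / (25.0 * (3.0 * c + 2.0) ** 3)
    return first + second - third


def ldf_exp_p2_n2(c):
    """P_2 for n = 2, obtained by interchanging lambda and mu."""
    return ldf_exp_p1_n2(1.0 / c)


for c in (1, 2, 3, 4, 5, 10):
    print(c, round(ldf_exp_p1_n2(c), 4), round(ldf_exp_p2_n2(c), 4))
```

The output agrees with the n=2 linear discriminant rows of Table 3 at c = 1, 3, 5 and 10 and in the $P_2$ column. At c=2 the expression evaluates to about 0.3637 for $P_1$, whereas the table reads .3736, so that entry may deserve a second look.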

$P_1$ and $P_2$ are next computed for the nonparametric
discriminator for the case n=2. The procedure used is
exactly the procedure used in Section II. The substitution
$\lambda = c\mu$ is once more appropriate. $P_1$ and $P_2$
in terms of a single parameter c are as follows:
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k 



(■t 




4- 



P^(c) 



™ (30c 2 38e 112) 

-TFT““2TI‘^-r-n“ 



(32 + 2L\.c 56c^ ° 12c3) 

3(3c + 2)(c^ - 1) 








l)(2c + 1) 






(112 - 52c = 30c^) 

”TJi:c + I|.7{c' :=“1T“ 



ij-Qc + 
3Hc“+ 



8 ) 

TiT 



for 0^2 



P,(0) 



(30c^ - ^.8c - 112) 

“X5Xc’'TinT~“rr 



:32 -I- 2lj.c - 56c^ 12c3) 



3(3c + 2)(c' 



1) 



(2c^ 4- 16c^ 
(2c + l)(c‘ 
(3c + 1) 

(c + 1)^' 



2cj_ 

1 ) 



. {2k - kc - lOc^) 

3Tc” tRr-^ 



^(3c + 2J 



for c = 2 



= Mli 



Values of $P_1$ and $P_2$ for various values of c for the
nonparametric discriminator with n=2 are given in Table 3.
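The nearest neighbor error probabilities for exponential populations can also be obtained by applying the procedure of Section II numerically. The sketch below is an illustration under stated assumptions: $\lambda$ and $\mu$ are treated as rate parameters, the inner integral is split at $\delta = z$ because the density of $|Y - z|$ changes form there, and the function names, step counts, and truncation limits are choices made for the example.

```python
import math


def simpson(func, a, b, steps=100):
    """Composite Simpson rule on [a, b]; `steps` must be even."""
    if b <= a:
        return 0.0
    h = (b - a) / steps
    total = func(a) + func(b)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * func(a + i * h)
    return total * h / 3.0


def exp_cdf(x, rate):
    """Distribution function for the density rate * exp(-rate * x), x > 0."""
    return 1.0 - math.exp(-rate * x) if x > 0.0 else 0.0


def nn_exp_cond_error(z, n, lam, mu):
    """P(nearest of the 2n sample values to z is a y), X ~ Exp(lam), Y ~ Exp(mu)."""
    def integrand(d, below):
        far_x = 1.0 - (exp_cdf(z + d, lam) - exp_cdf(z - d, lam))  # P(|X - z| > d)
        far_y = 1.0 - (exp_cdf(z + d, mu) - exp_cdf(z - d, mu))    # P(|Y - z| > d)
        dens_y = mu * math.exp(-mu * (z + d))       # contribution of y = z + d
        if below:
            dens_y += mu * math.exp(-mu * (z - d))  # y = z - d exists only for d < z
        return n * far_x ** n * far_y ** (n - 1) * dens_y
    tail = 20.0 / (lam + mu)  # integrand is negligible beyond z + tail
    return (simpson(lambda d: integrand(d, True), 0.0, z)
            + simpson(lambda d: integrand(d, False), z, z + tail))


def nn_exp_error(n, lam, mu, steps=300):
    """P_1: average the conditional error over Z ~ Exp(lam)."""
    density = lambda z: lam * math.exp(-lam * z)
    return simpson(lambda z: nn_exp_cond_error(z, n, lam, mu) * density(z),
                   0.0, 30.0 / lam, steps)


print(round(nn_exp_error(2, 2.0, 1.0), 4))  # P_1 at c = 2
print(round(nn_exp_error(2, 1.0, 2.0), 4))  # P_2 at c = 2, by interchanging rates
```

Only the ratio c of the two rates matters, so $P_2$ at a given c is obtained by calling the function with the rates interchanged. For n=1 the result coincides with the linear discriminant function, which provides a check on the integration.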

It is observed in Table 3 that $P_1$ and $P_2$ exceed 0.5
for numerous values of c. Because of this observation, an
investigation was made to determine the values of c for
which $P_1$ and $P_2$ exceed 0.5. Figure 5 displays
graphically the regions in the $\lambda$, $\mu$ plane where
$P_1$ and $P_2$ are greater than 0.5.
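For n=1 the boundary of such a region can be located directly from the closed form. The sketch below takes $P_1(c) = (10c^2 + 15c + 2)/(3(c+1)(c+2)(2c+1))$ and $P_2(c) = P_1(1/c)$ with $c = \lambda/\mu$, and finds by bisection the value of c above 1 at which $P_2$ drops back to 0.5. The function names and the bracketing interval are assumptions made for the example; it covers the n=1 case only.

```python
def p1_n1(c):
    """P_1 for n = 1 with c = lambda / mu."""
    return (10.0 * c * c + 15.0 * c + 2.0) / (3.0 * (c + 1.0) * (c + 2.0) * (2.0 * c + 1.0))


def p2_n1(c):
    """P_2 for n = 1, obtained by interchanging lambda and mu."""
    return p1_n1(1.0 / c)


def bisect_root(func, lo, hi, iterations=80):
    """Bisection for a sign change of func on [lo, hi]."""
    f_lo = func(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid  # root lies in the upper half
        else:
            hi = mid               # root lies in the lower half
    return 0.5 * (lo + hi)


# P_2 exceeds 0.5 at c = 2 and is below 0.5 at c = 10, so a crossing lies between.
upper = bisect_root(lambda c: p2_n1(c) - 0.5, 2.0, 10.0)
print("P_2 > 0.5 for 1 < c <", round(upper, 4))
print("P_1 > 0.5 for", round(1.0 / upper, 4), "< c < 1")
```

Setting $P_2(c) = 0.5$ reduces to $(c - 1)(2c^2 - 7c - 6) = 0$, so the crossing is at $c = (7 + \sqrt{97})/4$, roughly 4.21. This matches Table 3, where $P_2$ for n=1 is above 0.5 at c=4 and below it at c=5.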

Figure 5 points out only too well that great caution should
be used when applying the linear discriminant function in
situations when the populations are other than normal.
TABLE 3

PROBABILITIES OF ERROR, UNIVARIATE
EXPONENTIAL DISTRIBUTIONS

                                  c        1      2      3      4      5      10

Linear Discriminant            $P_1$    .5000  .4000  .3262  .2741  .2360  .1385
Function*, n=1                 $P_2$    .5000  .5333  .5214  .5037  .4870  .4329

Linear Discriminant            $P_1$    .5000  .3736  .2652  .2009  .1567  .0627
Function, n=2                  $P_2$    .5000  .5299  .5041  .4782  .4577  .4056

Nonparametric                  $P_1$    .5000  .4222  .3559  .3066  .2692  .1675
Discriminator, n=2             $P_2$    .5000  .5003  .4666  .4295  .3706  .3281

c is a parameter such that $\lambda = c\mu$
$\lambda$ is the parameter of the F population
$\mu$ is the parameter of the G population
$P_1$ = P(assigning Z to G | Z came from F)
$P_2$ = P(assigning Z to F | Z came from G)
n = sample size

*For n=1, the probabilities of error $P_1$ and $P_2$ for the
linear discriminant function are equal to the corresponding
probabilities of error $P_1$ and $P_2$ for the nonparametric
discriminator.
FIGURE 5

Values of $\lambda$ and $\mu$ for which $P_1$ and $P_2$
exceed 0.5
c = parameter such that $\lambda = c\mu$
$\lambda$ = parameter of F distribution
$\mu$ = parameter of G distribution
$P_1$ = P(Z is assigned to G | Z came from F)
$P_2$ = P(Z is assigned to F | Z came from G)
n = sample size

*Linear discriminant function is equivalent to the
nonparametric discriminator for n = 1.
SECTION IV

SUMMARY AND CONCLUSIONS

In any discrimination problem one has a choice between using
parametric or nonparametric procedures. This choice in
general will depend upon three factors:

   (i)   the strength of the user's belief in his parametric
         model.
   (ii)  the loss that would be suffered by using the
         nonparametric rule if in fact the parametric form
         is correct.
   (iii) the loss that would be suffered by using the
         parametric rule if the actual densities depart
         from the parametric form assumed.

For the two-population discrimination problem, Section II of
this paper concerned itself with (ii). In Section II, it was
assumed that the two populations being discriminated were
normal with equal covariance matrices. For the univariate
case, the parametric procedure used was the well-known
linear discriminant function, which is known to be optimal
in this situation. The nonparametric procedure used was the
rule whereby a random variable was classified as belonging
to the population which had the nearest observation to an
observed value of the random variable being classified. A
comparison of these two procedures was made by computing and
comparing the probabilities of misclassification.
Also for the two-population discrimination problem, an
investigation of the linear discriminant function and the
same nonparametric procedure was carried out when the two
populations were not normal but exponential. Again the
investigation was made by computing the probabilities of
misclassification for both procedures. This investigation
was made in Section III of this paper. Because of the
lengthy computations involved in computing the probabilities
of error for both of these procedures, the only cases
considered were the univariate case for sample sizes of 1
and 2. It was shown that, for the two cases investigated,
both procedures could give poor results depending on the
parameters of the distributions.

In conclusion, it seems reasonable that if the populations
to be discriminated are well known, and have been
investigated to be such that the normal distribution gives a
good fit and that the variance and correlation do not change
much when the means are changed, and if the classification
to be made warrants the labor of matrix inversion, then the
linear discriminant function should be used. However, if the
populations are either not well known, or are known not to
be approximately normal or to have very different covariance
matrices, or if the discrimination is such that small
decreases in probability of error are not worth extensive
computations, then a nonparametric procedure seems to be
advisable. Which nonparametric procedure is a matter of
choice for the user.

Recommendations to be made on the basis of this paper are:

   (i)   tabulate the probabilities of error for the linear
         discriminant function in representative situations
         for the case where the populations being
         discriminated are multivariate normal with equal
         covariance matrices.
   (ii)  investigate further (for larger sample sizes) the
         linear discriminant function in the case where the
         populations being discriminated are exponential,
         because of the importance of the exponential
         distribution in the field of life testing and other
         applied problems.
   (iii) investigate the effect of other distance functions
         for the nonparametric discriminator discussed in
         this paper in the case when the populations being
         discriminated are exponential or some other class
         of distributions.